Nonlinear Models Using Dirichlet Process Mixtures
Authors
Abstract
We introduce a new nonlinear model for classification, in which we model the joint distribution of response variable, y, and covariates, x, non-parametrically using Dirichlet process mixtures. We keep the relationship between y and x linear within each component of the mixture. The overall relationship becomes nonlinear if the mixture contains more than one component, with different regression coefficients. We use simulated data to compare the performance of this new approach to alternative methods such as multinomial logit (MNL) models, decision trees, and support vector machines. We also evaluate our approach on two classification problems: identifying the folding class of protein sequences and detecting Parkinson’s disease. Our model can sometimes improve predictive accuracy. Moreover, by grouping observations into sub-populations (i.e., mixture components), our model can sometimes provide insight into hidden structure in the data.
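The idea that locally linear components combine into a nonlinear overall relationship can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a fixed, finite two-component mixture with hand-picked parameters instead of a Dirichlet process posterior, a single covariate, and a binary logistic model as the two-class special case of the MNL model. All names and numbers (`COMPONENTS`, `prob_y1`, the weights, means, and coefficients) are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def sigmoid(t):
    """Logistic function, used for the within-component linear model."""
    return 1.0 / (1.0 + math.exp(-t))


# Hypothetical components (assumed, not taken from the paper):
# (mixture weight, mean of x, sd of x, intercept, slope)
COMPONENTS = [
    (0.5, -2.0, 1.0, 6.0, 3.0),   # near x = -2, P(y=1) rises with x
    (0.5, 2.0, 1.0, 6.0, -3.0),   # near x = +2, P(y=1) falls with x
]


def prob_y1(x, components=COMPONENTS):
    """P(y = 1 | x) under the joint mixture model of (x, y).

    Each component models x with a normal density and y | x with a
    logistic regression that is linear in x. The component densities
    of x give the responsibilities that weight the local predictions.
    """
    joint = [w * normal_pdf(x, m, s) for (w, m, s, _, _) in components]
    total = sum(joint)
    return sum(
        (j / total) * sigmoid(a + b * x)
        for j, (_, _, _, a, b) in zip(joint, components)
    )


if __name__ == "__main__":
    for x in (-4.0, -2.0, 0.0, 2.0, 4.0):
        print(f"x = {x:+.1f}  P(y=1 | x) = {prob_y1(x):.4f}")
```

With these assumed parameters, the predicted probability of class 1 is high between the two component centres and low on either side, a pattern that no single logistic regression in x can produce because its predicted probability is monotone in x.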
Similar resources
Clustering in Additive Mixed Models with Approximate Dirichlet Process Mixtures using the EM Algorithm
SUMMARY: We consider additive mixed models for longitudinal data with a nonlinear time trend. As random effects distribution an approximate Dirichlet process mixture is proposed that is based on the truncated version of the stick breaking presentation of the Dirichlet process and provides a Gaussian mixture with a data driven choice of the number of mixture components. The main advantage of the...
Improving Classification Models When a Class Hierarchy Is Available
Improving classification models when a class hierarchy is available. Babak Shahbaba, Doctor of Philosophy, Graduate Department of Public Health Sciences, University of Toronto, 2007. We introduce a new method for modeling hierarchical classes, when we have prior knowledge of how these classes can be arranged in a hierarchy. The application of this approach is discussed for linear models, as well as n...
Practical Bayesian inference using mixtures of mixtures.
Discrete mixtures of normal distributions are widely used in modeling amplitude fluctuations of electrical potentials at synapses of human and other animal nervous systems. The usual framework has independent data values y_j arising as y_j = mu_j + x_{n0+j}, where the means mu_j come from some discrete prior G(mu) and the unknown x_{n0+j}'s and observed x_j, j = 1,...,n0, are Gaussian noise terms. A...
Introducing of Dirichlet process prior in the Nonparametric Bayesian models frame work
Statistical models are used to learn about the mechanism that generates the data. It is often assumed that the random variables y_i, i = 1, …, n, are samples from a probability distribution F that belongs to a parametric class of distributions. However, in practice, a parametric model may be inappropriate to describe the data. In this setting, the parametric assumption could be r...
A Novel Information-Theoretic Approach for Variable Clustering and Predictive Modeling Using Dirichlet Process Mixtures
In the era of big data, there is increasing interest in clustering variables to minimize data redundancy and maximize variable relevancy. Existing clustering methods, however, depend on nontrivial assumptions about the data structure. Note that nonlinear interdependence among variables poses significant challenges to the traditional framework of predictive modeling. In ...
Journal title:
- Journal of Machine Learning Research
Volume 10, Issue
Pages -
Publication date: 2009